Privacy-Preserving Publishing Frequent Sequential Patterns

Author

  • Huidong Jin
Abstract

Releasing frequent sequential patterns can compromise the individual privacy of the underlying sequences. We propose two concrete objectives as a potential standard for privacy-preserving publishing of sequential patterns: k-anonymity and α-dissociation. The first, extended from the k-anonymity model for data, addresses the problem of inferring patterns with very low support, say, in [1, k), where k is an anonymity threshold. These inferred patterns can act as quasi-identifiers in linking attacks. We show theoretically that, for all but one definition of support, it is impossible to reliably infer support values for patterns with two or more negative items (items that do not occur in a pattern) based solely on frequent sequential patterns. We formulate possible privacy inference channels for the remaining definition of support. The...

This paper is an extension of the PAKDD’07 conference paper [1]. It gives a comprehensive description and theoretical analyses. In particular, Theorems 2-6 and their proofs are added, and related work, more examples and experimental results are included. The author would like to thank J. Chen, H. He, C. M. O’Keefe and his other colleagues at CSIRO, Australia, for their comments and help with this work. Partial financial support for this work from NICTA is acknowledged. NICTA is funded by the Australian Government’s Backing Australia’s Ability initiative, in part through the Australian Research Council.

NICTA Canberra Lab, Locked Bag 8001, ACT 2601 Australia. Email: [email protected]. Tel: +61 2 61259500. Fax: +61 2 61258660. The work was partially done when he was with CSIRO Mathematical and Information Sciences, Canberra ACT 2601, Australia. RSISE, the Australian National University, Canberra ACT, 0200, Australia.
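To make the low-support inference problem concrete, here is a minimal sketch in Python. It is not the paper's method. It assumes one particular definition of support (the number of sequences that contain a pattern as a subsequence) and covers only the simplest case, in which a single super-pattern is excluded. Under that assumption, for a released pattern p and a released super-pattern q, supp(p) - supp(q) is the number of sequences that contain p but not q, and a difference in [1, k) points to a group of fewer than k individuals. The toy database and the names `is_subsequence`, `support` and `low_support_differences` are hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Pattern, sequence: Pattern) -> bool:
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern: Pattern, database: List[Pattern]) -> int:
    """Assumed definition: number of sequences containing the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def low_support_differences(
    released: Dict[Pattern, int], k: int
) -> Dict[Tuple[Pattern, Pattern], int]:
    """Find pairs (p, q), with p a proper sub-pattern of q, where
    supp(p) - supp(q) lies in [1, k)."""
    found = {}
    for p, supp_p in released.items():
        for q, supp_q in released.items():
            if p != q and is_subsequence(p, q):
                diff = supp_p - supp_q  # sequences with p but without q
                if 1 <= diff < k:
                    found[(p, q)] = diff
    return found


# Hypothetical toy database of five sequences.
db: List[Pattern] = [
    ("a", "b", "c"),
    ("a", "b"),
    ("a", "c"),
    ("a", "b", "c"),
    ("a", "b"),
]

# Supports that a miner might release (all are at least 2 here).
patterns: List[Pattern] = [("a",), ("a", "b"), ("a", "c"), ("a", "b", "c")]
released = {p: support(p, db) for p in patterns}

channels = low_support_differences(released, k=3)
for (p, q), diff in sorted(channels.items()):
    print(f"{diff} sequence(s) contain {p} but not {q}")
```

Every released support in this toy example is at least 2, yet the difference between `("a",)` and `("a", "b")` shows that exactly one sequence contains `a` without a later `b`. That is the kind of small group a linking attack could exploit.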


Similar resources

Privacy-Preserving Sequential Pattern Release

We investigate situations where releasing frequent sequential patterns can compromise individual’s privacy. We propose two concrete objectives for privacy protection: k-anonymity and α-dissociation. The first addresses the problem of inferring patterns with very low support, say, in [1, k). These inferred patterns can become quasi-identifiers in linking attacks. We show that, for all but one de...


Mining Frequent Patterns with Differential Privacy

The mining of frequent patterns is a fundamental component in many data mining tasks. A considerable amount of research on this problem has led to a wide series of efficient and scalable algorithms for mining frequent patterns. However, releasing these patterns is posing concerns on the privacy of the users participating in the data. Indeed the information from the patterns can be linked with a...


Privacy Preserving Data Mining of Sequential Patterns for Network Traffic Data

As the total amount of traffic data in networks has been growing at an alarming rate, much research is currently being done on mining traffic data to obtain useful information. However, since network traffic data contain information about users’ Internet usage patterns, network users’ privacy can be compromised during the mining process. In this paper, we propose an ...


Differentially Private Trajectory Data Publication

With the increasing prevalence of location-aware devices, trajectory data has been generated and collected in various application domains. Trajectory data carries rich information that is useful for many data analysis tasks. Yet, improper publishing and use of trajectory data could jeopardize individual privacy. However, it has been shown that existing privacy-preserving trajectory data publish...


Hiding Emerging Patterns with Local Recoding Generalization

When data is published to the public, it is vastly preferable to publish meaningful data and yet protect embedded sensitive patterns. This process is often referred to as privacy preserving data publishing (PPDP). In this paper, we investigate PPDP in the context of frequent itemsets mining – one of the fundamental data-mining concepts – and emerging patterns (EPs) – patterns that have a high class...




Publication date: 2007